Goto

Collaborating Authors

 trial 1



Sensory-Motor Control with Large Language Models via Iterative Policy Refinement

arXiv.org Artificial Intelligence

We propose a method that enables large language models (LLMs) to control embodied agents through the generation of control policies that directly map continuous observation vectors to continuous action vectors. At the outset, the LLMs generate a control strategy based on a textual description of the agent, its environment, and the intended goal. This strategy is then iteratively refined through a learning process in which the LLMs are repeatedly prompted to improve the current strategy, using performance feedback and sensory-motor data collected during its evaluation. The method is validated on classic control tasks from the Gymnasium library and the inverted pendulum task from the MuJoCo library. The approach proves effective with relatively compact models such as GPT-oss:120b and Qwen2.5:72b. In most cases, it successfully identifies optimal or near-optimal solutions by integrating symbolic knowledge derived through reasoning with sub-symbolic sensory-motor data gathered as the agent interacts with its environment.


A Full Method Description ADVI Baseline Uses a full-rank Gaussian initialized to standard

Neural Information Processing Systems

Equation (3); uses our comprehensive step-size search for updates. Importance-weighted training is used with M = 10 (optimizes IW-ELBO with M = 10). Importance-weighted sampling is not used; importance-weighted training is not used. Importance-weighted sampling is not used; importance-weighted training is not used. Importance-weighted training is used with M = 10.


Regularising NARX models with multi-task learning

arXiv.org Artificial Intelligence

A Nonlinear Auto-Regressive with eXogenous inputs (NARX) model can be used to describe time-varying processes; where the output depends on both previous outputs and current/previous external input variables. One limitation of NARX models is their propensity to overfit and result in poor generalisation for future predictions. The proposed method to help to overcome the issue of overfitting is a NARX model which predicts outputs at both the current time and several lead times into the future. This is a form of multi-task learner (MTL); whereby the lead time outputs will regularise the current time output. This work shows that for high noise level, MTL can be used to regularise NARX with a lower Normalised Mean Square Error (NMSE) compared to the NMSE of the independent learner counterpart.


Predicting change in time production -- A machine learning approach to time perception

arXiv.org Artificial Intelligence

Time perception research has advanced significantly over the years. However, some areas remain largely unexplored. This study addresses two such under-explored areas in timing research: (1) A quantitative analysis of time perception at an individual level, and (2) Time perception in an ecological setting. In this context, we trained a machine learning model to predict the direction of change in an individual's time production. The model's training data was collected using an ecologically valid setup. We moved closer to an ecological setting by conducting an online experiment with 995 participants performing a time production task that used naturalistic videos (no audio) as stimuli. The model achieved an accuracy of 61%. This was 10 percentage points higher than the baseline models derived from cognitive theories of timing. The model performed equally well on new data from a second experiment, providing evidence of its generalization capabilities. The model's output analysis revealed that it also contained information about the magnitude of change in time production. The predictions were further analysed at both population and individual level. It was found that a participant's previous timing performance played a significant role in determining the direction of change in time production. By integrating attentional-gate theories from timing research with feature importance techniques from machine learning, we explained model predictions using cognitive theories of timing. The model and findings from this study have potential applications in systems involving human-computer interactions where understanding and predicting changes in user's time perception can enable better user experience and task performance.


Precise Robotic Weed Spot-Spraying for Reduced Herbicide Usage and Improved Environmental Outcomes -- A Real-World Case Study

arXiv.org Artificial Intelligence

Precise robotic weed control plays an essential role in precision agriculture. It can help significantly reduce the environmental impact of herbicides while reducing weed management costs for farmers. In this paper, we demonstrate that a custom-designed robotic spot spraying tool based on computer vision and deep learning can significantly reduce herbicide usage on sugarcane farms. We present results from field trials that compare robotic spot spraying against industry-standard broadcast spraying, by measuring the weed control efficacy, the reduction in herbicide usage, and the water quality improvements in irrigation runoff. The average results across 25 hectares of field trials show that spot spraying on sugarcane farms is 97% as effective as broadcast spraying and reduces herbicide usage by 35%, proportionally to the weed density. For specific trial strips with lower weed pressure, spot spraying reduced herbicide usage by up to 65%. Water quality measurements of irrigation-induced runoff, three to six days after spraying, showed reductions in the mean concentration and mean load of herbicides of 39% and 54%, respectively, compared to broadcast spraying. These promising results reveal the capability of spot spraying technology to reduce herbicide usage on sugarcane farms without impacting weed control and potentially providing sustained water quality benefits.


Happily Error After: Framework Development and User Study for Correcting Robot Perception Errors in Virtual Reality

arXiv.org Artificial Intelligence

While we can see robots in more areas of our lives, they still make errors. One common cause of failure stems from the robot perception module when detecting objects. Allowing users to correct such errors can help improve the interaction and prevent the same errors in the future. Consequently, we investigate the effectiveness of a virtual reality (VR) framework for correcting perception errors of a Franka Panda robot. We conducted a user study with 56 participants who interacted with the robot using both VR and screen interfaces. Participants learned to collaborate with the robot faster in the VR interface compared to the screen interface. Additionally, participants found the VR interface more immersive, enjoyable, and expressed a preference for using it again. These findings suggest that VR interfaces may offer advantages over screen interfaces for human-robot interaction in erroneous environments.


Advances in Black-Box VI: Normalizing Flows, Importance Weighting, and Optimization

arXiv.org Machine Learning

Recent research has seen several advances relevant to black-box VI, but the current state of automatic posterior inference is unclear. One such advance is the use of normalizing flows to define flexible posterior densities for deep latent variable models. Another direction is the integration of Monte-Carlo methods to serve two purposes; first, to obtain tighter variational objectives for optimization, and second, to define enriched variational families through sampling. However, both flows and variational Monte-Carlo methods remain relatively unexplored for black-box VI. Moreover, on a pragmatic front, there are several optimization considerations like step-size scheme, parameter initialization, and choice of gradient estimators, for which there are no clear guidance in the existing literature. In this paper, we postulate that black-box VI is best addressed through a careful combination of numerous algorithmic components. We evaluate components relating to optimization, flows, and Monte-Carlo methods on a benchmark of 30 models from the Stan model library. The combination of these algorithmic components significantly advances the state-of-the-art "out of the box" variational inference.